12151_differentially_private_general.pdf

Neural Information Processing Systems

A.3 Low Dimension. Before presenting the proof of Theorem 1, we provide formal statements of its corollaries. We then bound average argument stability in terms of average regret (Lemma 5). Substituting these into the above equation gives the claimed bound. We now fill in the details. Thus, substituting the above into Eqn. (3) along with the bound from Lemma 6, we obtain the claimed bound on $\mathbb{E}[L(\widehat{w}; D) - L(w^*; D)]$; substituting the value of $G$ completes the proof.
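The displayed bound itself does not survive in the excerpt. A standard decomposition of this kind, given here only as a hedged sketch with assumed notation ($G$ a Lipschitz constant, $\Delta_t$ the argument stability of the $t$-th iterate, $\mathrm{Reg}_T$ the regret of the underlying online algorithm over $T$ steps), takes the form
\[
\mathbb{E}\big[L(\widehat{w}; D) - L(w^*; D)\big]
\;\le\; \frac{G}{T}\sum_{t=1}^{T} \mathbb{E}[\Delta_t]
\;+\; \frac{\mathbb{E}[\mathrm{Reg}_T]}{T},
\]
i.e., excess population risk is controlled by average argument stability plus average regret, matching the proof outline above.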


APPENDIX: In this section, we provide the details of our implementation and proofs for reproducibility

Neural Information Processing Systems

... hidden state by $h$. Then we need to calculate the second part of the equation; using Bayes' theorem, we have $p(\dots)$. In Section 4.3, we devise a sigmoid function to adapt $\gamma$ during the supernet training, defined as $\gamma(t) = 1 - \mathrm{Sigmoid}\big((\tfrac{t}{\text{total epochs}} \cdot 2 - 1) \cdot b\big)$ (19). Section 3.2 theoretically demonstrates the benefit of the proposed architecture complementation loss function.
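As a quick illustration of Eq. (19), here is a minimal Python sketch of the schedule; the parameter names (total_epochs, slope b) and the exact argument scaling are assumptions inferred from the extracted formula, not the authors' code.

import math

def gamma(t: int, total_epochs: int, b: float) -> float:
    # Eq. (19), as reconstructed: map epoch t to [-1, 1], scale by b,
    # pass through a sigmoid, and flip so gamma decays from ~1 to ~0.
    x = (2.0 * t / total_epochs - 1.0) * b
    return 1.0 - 1.0 / (1.0 + math.exp(-x))

# gamma starts near 1, crosses 0.5 at mid-training, and ends near 0.
print([round(gamma(t, total_epochs=100, b=5.0), 3) for t in (0, 50, 100)])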


Magnitude and Angle Dynamics in Training Single ReLU Neurons

Lee, Sangmin, Sim, Byeongsu, Ye, Jong Chul

arXiv.org Artificial Intelligence

To understand the learning dynamics of deep ReLU networks, we investigate the dynamical system of gradient flow $w(t)$ by decomposing it into magnitude $\|w(t)\|$ and angle $\phi(t) := \pi - \theta(t)$ components. In particular, for multi-layer single ReLU neurons with a spherically symmetric data distribution and the square loss function, we provide upper and lower bounds for the magnitude and angle components to describe the dynamics of gradient flow. Using the obtained bounds, we conclude that small-scale initialization induces slow convergence for deep single ReLU neurons. Finally, by exploiting the relation between gradient flow and gradient descent, we extend our results to the gradient descent approach. All theoretical results are verified by experiments.
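The magnitude/angle decomposition is easy to observe numerically. Below is a minimal Python sketch, under simplifying assumptions not taken from the paper (a single one-layer ReLU neuron, a ReLU teacher, Gaussian inputs), that trains with gradient descent on the square loss and prints $\|w(t)\|$ and $\phi(t) = \pi - \theta(t)$, where $\theta(t)$ is the angle between $w$ and the teacher direction.

import numpy as np

rng = np.random.default_rng(0)
d = 10
v = np.zeros(d); v[0] = 1.0                # teacher direction (unit norm)
w = 1e-2 * rng.standard_normal(d)          # small-scale initialization

def relu(z):
    return np.maximum(z, 0.0)

lr, steps, n = 0.05, 2000, 512
for step in range(steps + 1):
    X = rng.standard_normal((n, d))        # spherically symmetric data
    pred, target = relu(X @ w), relu(X @ v)
    # Square loss 0.5 * mean((pred - target)^2); gradient w.r.t. w
    grad = ((pred - target) * (X @ w > 0)) @ X / n
    if step % 500 == 0:
        mag = np.linalg.norm(w)
        cos = w @ v / (mag + 1e-12)        # v has unit norm
        theta = np.arccos(np.clip(cos, -1.0, 1.0))
        print(f"step {step:4d}  |w| = {mag:.4f}  phi = {np.pi - theta:.4f}")
    w -= lr * grad

With small initialization the magnitude stays tiny for many steps before growing, which is the slow-convergence effect the bounds formalize.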


Active Learning for Identification of Linear Dynamical Systems

Wagenmaker, Andrew, Jamieson, Kevin

arXiv.org Machine Learning

We propose an algorithm to actively estimate the parameters of a linear dynamical system. Given complete control over the system's input, our algorithm adaptively chooses the inputs to accelerate estimation. We show a finite-time bound quantifying the estimation rate our algorithm attains and prove matching upper and lower bounds which guarantee its asymptotic optimality, up to constants. In addition, we show that this optimal rate is unattainable when using Gaussian noise to excite the system, even with optimally tuned covariance, and analyze several examples where our algorithm provably improves over rates obtained by playing noise. Our analysis relies critically on a novel result quantifying the error in estimating the parameters of a dynamical system when arbitrary periodic inputs are played. We conclude with numerical examples that illustrate the effectiveness of our algorithm in practice.
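As a toy illustration of the periodic-input phenomenon (not the paper's algorithm), the sketch below identifies $A$ in $x_{t+1} = A x_t + B u_t + w_t$ by least squares and compares a sinusoidal input against Gaussian noise of matched average power; all system matrices and signal choices are assumptions for illustration.

import numpy as np

rng = np.random.default_rng(1)
A = np.array([[0.9, 0.2], [0.0, 0.9]])
B = np.eye(2)
T, sigma = 2000, 0.1

def rollout(inputs):
    # Simulate x_{t+1} = A x_t + B u_t + w_t; return regressors and targets.
    x = np.zeros(2)
    X, Xn, U = [], [], []
    for u in inputs:
        xn = A @ x + B @ u + sigma * rng.standard_normal(2)
        X.append(x); Xn.append(xn); U.append(u)
        x = xn
    return np.array(X), np.array(Xn), np.array(U)

def estimate_A(X, Xn, U):
    # Least-squares fit of [A B] from stacked regressors [x_t, u_t].
    Z = np.hstack([X, U])
    Theta, *_ = np.linalg.lstsq(Z, Xn, rcond=None)
    return Theta.T[:, :2]                  # first block is the A estimate

t = np.arange(T)
periodic = np.stack([np.sin(0.3 * t), np.cos(0.3 * t)], axis=1)  # periodic excitation
gaussian = rng.standard_normal((T, 2)) / np.sqrt(2)              # matched average power

for name, u in [("periodic", periodic), ("gaussian", gaussian)]:
    A_hat = estimate_A(*rollout(u))
    print(name, "error:", np.linalg.norm(A_hat - A))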


Neural tangent kernels, transportation mappings, and universal approximation

Ji, Ziwei, Telgarsky, Matus, Xian, Ruicheng

arXiv.org Machine Learning

This paper establishes rates of universal approximation for the shallow neural tangent kernel (NTK): network weights are only allowed microscopic changes from random initialization, which entails that activations are mostly unchanged, and the network is nearly equivalent to its linearization. Concretely, the paper has two main contributions: a generic scheme to approximate functions with the NTK by sampling from transport mappings between the initial weights and their desired values, and the construction of transport mappings via Fourier transforms. Regarding the first contribution, the proof scheme provides another perspective on how the NTK regime arises from rescaling: redundancy in the weights due to resampling allows individual weights to be scaled down. Regarding the second contribution, the most notable transport mapping asserts that roughly $1 / \delta^{10d}$ nodes are sufficient to approximate continuous functions, where $\delta$ depends on the continuity properties of the target function. By contrast, nearly the same proof yields a bound of $1 / \delta^{2d}$ for shallow ReLU networks; this gap suggests a tantalizing direction for future work, separating shallow ReLU networks and their linearization.
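The "nearly equivalent to its linearization" claim can be checked directly. The following Python sketch (scaling choices and perturbation size are assumptions, not the paper's construction) compares the output change of a wide shallow ReLU network under a microscopic weight perturbation against the change predicted by its linearization at initialization.

import numpy as np

rng = np.random.default_rng(0)
d, m = 5, 50_000                        # input dim, width (large -> NTK regime)
x = rng.standard_normal(d); x /= np.linalg.norm(x)

W0 = rng.standard_normal((m, d))        # random init of hidden weights
a = rng.choice([-1.0, 1.0], size=m)     # fixed outer signs, 1/sqrt(m) scaling

def f(W):
    return (a @ np.maximum(W @ x, 0.0)) / np.sqrt(m)

# Microscopic perturbation: most activation patterns stay unchanged.
delta = 1e-3 * rng.standard_normal((m, d)) / np.sqrt(m)

# Gradient of f w.r.t. W at W0: row j is a_j * 1[w_j . x > 0] * x / sqrt(m).
G = (a * (W0 @ x > 0.0))[:, None] * x[None, :] / np.sqrt(m)

true_change = f(W0 + delta) - f(W0)
linear_change = np.sum(G * delta)
print("true:", true_change, "linearized:", linear_change)

At this scale the two changes agree to many digits, which is the sense in which the network behaves like its linearization.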


Robust stability of moving horizon estimation for nonlinear systems with bounded disturbances using adaptive arrival cost

Deniz, Nestor N., Murillo, Marina H., Sanchez, Guido, Genzelis, Lucas M., Giovanini, Leonardo

arXiv.org Artificial Intelligence

In this paper, robust stability and convergence to the true state are established for a moving horizon estimator based on an adaptive arrival cost, applied to nonlinear detectable systems. Robust global asymptotic stability is shown for the case of non-vanishing bounded disturbances, whereas convergence to the true state is proved for the case of vanishing disturbances. Several simulations were carried out to show the estimator's behaviour under different operating conditions and to compare it with state-of-the-art estimation methods.
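For concreteness, here is a minimal Python sketch of the moving horizon idea on a linear toy system; the paper treats nonlinear systems with an adaptive arrival cost, whereas this sketch assumes a fixed arrival-cost weight and a crude prior update, both simplifications.

import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)
A = np.array([[1.0, 0.1], [0.0, 1.0]])    # toy linear dynamics
C = np.array([[1.0, 0.0]])                # measure position only
Q, R, P = 0.05, 0.1, 1.0                  # disturbance / noise / arrival-cost weights

# Simulate the true system with bounded disturbances.
T, x = 60, np.array([0.0, 1.0])
xs, ys = [], []
for _ in range(T):
    xs.append(x)
    ys.append(C @ x + R * rng.uniform(-1, 1, 1))
    x = A @ x + Q * rng.uniform(-1, 1, 2)

def mhe_window(y_win, x_prior, N):
    # Decision variables: initial state x0 and disturbances w_0..w_{N-2}.
    def residuals(z):
        x0, w = z[:2], z[2:].reshape(N - 1, 2)
        r, xk = [(x0 - x_prior) / np.sqrt(P)], x0      # arrival cost
        for k in range(N):
            r.append((y_win[k] - C @ xk) / np.sqrt(R)) # measurement fit
            if k < N - 1:
                r.append(w[k] / np.sqrt(Q))            # disturbance penalty
                xk = A @ xk + w[k]
        return np.concatenate(r)
    z = least_squares(residuals, np.concatenate([x_prior, np.zeros(2 * (N - 1))])).x
    x0, w = z[:2], z[2:].reshape(N - 1, 2)
    traj = [x0]
    for k in range(N - 1):
        traj.append(A @ traj[-1] + w[k])
    return traj                                        # smoothed states over the window

N, x_prior = 10, np.array([0.0, 0.0])
for t in range(N, T + 1):
    traj = mhe_window(ys[t - N:t], x_prior, N)
    x_prior = traj[1]        # prior for the next window start
print("estimate:", traj[-1], "true:", xs[-1])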